Text-Based Automatic Language Identification

Author

H. P. Combrinck

Abstract

We present a statistical approach to text-based automatic language identification that focuses on discrimination between, as opposed to representation of, different language models. The system is evaluated on a text corpus containing six African and six European languages.

Similar resources

Language Identification Based on High Frequency Approaches

This paper deals with the problem of automatic language identification of noisy texts, which represents an important task in natural language processing. Actually, there exist several works in this field, which are based on statistical and machine learning approaches for different categories of texts. Unfortunately, most of the proposed methods work fine on clean texts or long texts, but often ...

A Two-Stage Split-and-Select Model for Automatic Indexing of Persian Texts

Purpose: Each language has its own problems, so automatic indexing needs models appropriate to each language. These models should address both the exhaustivity and the specificity of indexing. This paper introduces and evaluates a model suited to automatic indexing of Persian. The model suggests breaking the text into the particles of candidate terms and to c...

Offline Language-free Writer Identification based on Speeded-up Robust Features

This article proposes offline language-free writer identification based on speeded-up robust features (SURF), which goes through training, enrollment, and identification stages. In all stages, an isotropic Box filter is first used to segment the handwritten text image into word regions (WRs). Then, the SURF descriptors (SUDs) of word region and the corresponding scales and orientations (SOs) are extr...

New Features for Automatic Text Independent Language Identification

A. Nagesh (Mahatma Gandhi Institute of Technology, Hyderabad, India) and V. Kamakshi Prasad (Jawaharlal Nehru Technological University, Hyderabad, India). The objective of this paper is to explore new feature vectors for Automatic Text Independent Language Identif...

Automatic Semantic Role Labeling in Persian Sentences Using Dependency Trees

Automatic identification of words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching correct semantic roles to them, may lead to improvement in many natural language processing tasks including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...

Publication date: 1995
